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Abstract 

Optimal signaling for secrecy rate maximization in Gaussian MIMO wiretap channels is considered. 

While this channel has attracted a significant attention recently and a number of results have been 
obtained, including the proof of the optimality of Gaussian signalling, an optimal transmit covariance 
matrix is known for some special cases only and the general case remains an open problem. An 
iterative custom-made algorithm to find a globally-optimal transmit covariance matrix in the general 
case is developed in this paper, with guaranteed convergence to a global optimum. While the original 
optimization problem is not convex and hence difficult to solve, its minimax reformulation can be 
solved via the convex optimization tools, which is exploited here. The proposed algorithm is based 
on the barrier method extended to deal with a minimax problem at hand. Its convergence to a global 
optimum is proved for the general case (degraded or not) and a bound for the optimality gap is given 
for each step of the barrier method. The performance of the algorithm is demonstrated via numerical 
examples. In particular, 20 to 40 Newton steps are already sufficient to solve the sufficient optimality 
conditions with very high precision (up to the machine precision level), even for large systems. Even 
fewer steps are required if the secrecy capacity is the only quantity of interest. The algorithm can be 
significantly simplified for the degraded channel case and can also be adopted to include the per-antenna 
power constraints (instead or in addition to the total power constraint). It also solves the dual problem 
of minimizing the total power subject to the secrecy rate constraint. 
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I. Introduction 

Wide-spread use of wireless systems has initiated significant interest in their security and 
related information-theoretic studies [[U. Secrecy capacity has emerged as a key performance 
metric, which extends the regular channel capacity to accommodate the secrecy requirement. 
Wyner’s wire-tap channel (WTC) |[Il-[|3l| is the most popular model to accommodate secrecy, 
which was extended to the Gaussian channel dU and subsequently to the Gaussian multiple-input 
multiple-output (MIMO) setting [151-dH; the reader is referred to dll for a detailed discussion of 
this model and extensive literature review. The Gaussian MIMO WTC has been recently a subject 
of intense study and a number of results have been obtained, including the proof of optimality 
of Gaussian signaling dD, lEl-dSl- While the functional form of the optimal (capacity-achieving) 
distribution has been established, significantly less is known about its optimal covariance matrix 
(the only remaining parameter to completely characterize the distribution since the mean is 
always zero). 

The optimal transmit covariance matrix under the total power constraint has been obtained 
for some special cases, e.g. low/high SNR, multiple-input single-output (MISO) channels, full- 
rank, rank-1 or weak eavesdropper cases, or the parallel channel lISl- dT^ . but the general case 
remains illusive. The main difficulty lies in the fact that the underlying optimization problem is 
in general not a convex problem. It was conjectured in dll and proved in dll using an indirect 
approach (via the degraded channel) that the optimal signaling is on the positive directions of 
the difference channel (where the legitimate channel is stronger than the eavesdropper one). A 
direct proof based on the necessary Karush-Kuhn-Tucker (KKT) optimality conditions has been 
obtained in dH- A weaker form of this result (non-negative instead of positive directions) has 
been obtained earlier in [jH. In the general case, the rank of an optimal covariance matrix does 
not exceed the number of positive eigenvalues of the difference channel matrix dT4l . An exact 
full-rank solution for the optimal covariance has been obtained in dT4l and its properties have 
been characterized. In particular, unlike the regular channel (no eavesdropper), the optimal power 
allocation does not converge to uniform one at high SNR and the latter remains sub-optimal at 
any finite SNR. In the case of weak eavesdropper (its singular values are much smaller than 
those of the legitimate channel), the optimal signaling mimics the conventional one (water-filling 
over the channel eigenmodes) with an adjustment for the eavesdropper channel. The rank-one 
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solution in combination with the full-rank one provides a eomplete solution for the ease of two 
transmit antennas and any number of reeeive/eavesdropper antennas. The 2-2-1 ease (2 transmit, 
2 reeeive, 1 eavesdropper antenna) has been studied earlier in [[TOll and the MISO ease (single¬ 
antenna reeeiver) has been eonsidered in IfTTlIlfT^ and settled in lfT3]||f5l . for whieh beamforming 
is optimal and whieh is also the ease for a MIMO-WTC in the low SNR regime. The ease of 
isotropie eavesdropper is studied in detail in ifTSll . ineluding the optimal signaling in an explieit 
elosed form and its properties. This ease is shown to be the worst-ease MIMO wire-tap ehannel. 
Based on this, lower and upper (tight) eapaeity bounds have been obtained for the general ease, 
whieh are aehievable by an isotropie eavesdropper. The set of ehannels for whieh isotropie 
signaling is optimal has been fully eharaeterized ifTSll . It turns out to be more rieher than that of 
the eonventional (no eavesdropper) MIMO ehannel. A elosed-form solution was obtained in |[T^ 
for the ease of weak eavesdropper but otherwise arbitrary ehannel; its optimal power alloeation 
somewhat resembles the water-filling but is not identieal to it. For the ease of parallel ehannels, 
independent signaling is optimal ifTTlllfT^ . whieh implies that the optimal eovarianee matrix is 
diagonal; the eorresponding optimal power alloeation ean be found in ifT^ . This also implies that 
the eigenveetors of optimal eovarianee matrix are the same as the right singular veetors of the 
legitimate or eavesdropper ehannels when the latter two are the same ifT^ and the eorresponding 
power alloeation is the same as in ifT^ . The low-SNR regime has been studied in detail in [[T^ . 
In partieular, signaling on the strongest eigenmode(s) of the differenee ehannel matrix is optimal. 
Little is known beyond these speeial eases and the general ease is still an open problem. 

While numerieal algorithms have been proposed in ll20ll . ETI to eompute a transmit eovarianee 
matrix for the MIMO-WTC, their eonvergenee to a global optimum has not been proved. The 
main diffieulty lies in the faet that the underlying optimization problems are not eonvex and 
henee KKT eonditions are not suffieient for optimality lf2^ . In partieular, while the alternating 
optimization algorithm in [l20ll is shown to eonvergenee to a KKT (stationary) point, it is not 
neeessarily a global maximum (due to the above reason); it may, in faet, be a saddle point or a 
local rather than global maximum of the seereey ratqll and it is not known how far away it is 
from the global maximum. This remark also applies to the algorithms eonsidered in ETll . [l22l . 


'For non-convex problems, KKT point can also be a local minimum rather than maximum. This is ruled out in II20I hy the 
non-decreasing nature of the generated sequence of objective values. 
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The purpose of this paper is to develop a numerieal algorithm for computing a globally-o^iimdil 
covariance matrix in the general case, i.e. for the general Gaussian MIMO-WTC (degraded or 
not), with guaranteed convergence to a global optimum, and to prove its convergence. This is a 
challenging task as the underlying optimization problem is not convex so that standard tools of 
convex optimization cannot be used; in general, non-convex problems are much harder to solve 
lf2^ . We deal with this challenge by using the minimax representation of the secrecy capacity 
found in While this representation appears to be more complicated than the standard one (the 
former involves two conflicting optimizations while the latter - only one), it turns out to be much 
easier to solve, at least numerically, as we demonstrate using the primal-dual representation of 
Newton method in combination with the barrier method. The main advantage of this approach 
is that each of the two problems is convex, the saddle-point property holds and hence the 
respective KKT conditions are sufficient for global optimality (Slater’s condition holds as well). 
A conceptually-similar approach has been used before for optimizing the transmitter with per- 
antenna power constraints in the regular (no secrecy) MIMO broadcast channel in [|25]| . Our 
custom-made algorithm essentially solves the KKT optimality conditions (see e.g. If2^ for a 
background on these conditions), which are sufficient for the minimax problem at hand, in an 
iterative way using the primal-dual representation of Newton method in combination with the 
barrier method (to accommodate inequality constraints) adopted to the MIMO WTC setting, see 
Section V. A proof of the algorithm’s convergence to a global optimum is also provided for 
the general case. While we formulate the algorithm for the total power constraint, it can be 
easily modified to accommodate other forms of power constraint, e.g. maximum per-antenna 
constraint (instead or in addition to the total power constraint), and also to solve a dual problem 
of minimizing the total transmit power under the secrecy rate constraint. 

A key part of the convergence proof for our algorithm involves a proof of non-singularity of 
the KKT matrij^ so that Newton steps are well-defined for all iterations of the algorithms and 
they generate a sequence of norm-decreasing residuals and hence converge to a globally-optimal 
point (i.e. a solution of the KKT conditions which corresponds to zero residual). This is a difficult 
task since the underlining optimization problems involve both maximization and minimization 

singular KKT matrix would imply that the corresponding Newton step is not defined and thus the algorithm would terminate 
without converging to a global optimum. 
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and the corresponding KKT matrix is indefinite so that the regular tools developed for positive 
semi-definite matrices [l26ll do not apply. A block-partitioned factorization of the KKT matrix is 
used to accomplish it. This is explained in Section V, which also gives a bound on the optimality 
gap for each step of the barrier method. Numerical examples in Section VII demonstrate fast 
convergence of the algorithm: 20 to 40 Newton steps are already sufficient to achieve a very high 
precision (up to the machine precision level), even for large system. Even less steps are required 
if the secrecy capacity is the only quantity of interest. Section VI demonstrates that significant 
simplifications in the algorithm are possible for a degraded channel. Section IV gives a brief 
review of the barrier and Newton methods for inequality-constrained optimization, and presents 
an algorithm for minimax problems with guaranteed convergence to a global optimum. Section 
III summarizes the minimax representation of the secrecy capacity on which our algorithm is 
based. Section 2 reviews the Gaussian MIMO-WTC model and its secrecy capacity. 

II. Wire-Tap Gaussian MIMO Channel Model 

Let us consider the standard Gaussian MIMO wire-tap channel model, 

yi = Hix + ^i, y2 = H2X-F|2 (1) 

where x = [xi,X 2 , ■■■Xm\' G is the (real) transmitted signal vector of dimension m x 1, ' 
denotes transposition, yi( 2 ) G ^re the (real) received vectors at the receiver (eavesdrop¬ 

per), is ths additive white Gaussian noise at the receiver (eavesdropper) (normalized to 
unit variance in each dimension), Hi( 2 ) G is the ni( 2 ) x m matrix of the channel gains 

between each Tx and each receive (eavesdropper) antenna, 12 ^ 2 ) and m are the number of Rx 
(eavesdropper) and Tx antennas respectively. The channels Hi( 2 ) are assumed to be quasistatic 
(i.e., constant for a sufficiently long period of time so that the infinite horizon information theory 
assumption holds) and frequency-flat, with full channel state information (CSI) at the Rx and 
Tx ends. A secrecy rate is achievable for this channel if (i) the receiver is able to recover the 
message with arbitrary low error probability (reliability criterion) and (ii) the information leaked 
to the eavesdropper approaches zero asymptotically (secrecy criterion) [H]. 

For a given transmit covariance matrix R = Elxx'}, where E{-} is statistical expectation, 
the maximum achievable secrecy rate between the Tx and Rx (so that the rate between the Tx 
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Fig. 1. A block diagram of the Gaussian MIMO wiretap channel. Full channel state information is available at the transmitter. 
Hi( 2 ) is the channel matrix to the legitimate receiver (eavesdropper); x is the transmitted signal and yi( 2 ) is the received 
(eavesdropper) signal; Ci( 2 ) is the AWGN at the receiver (eavesdropper). The information leakage to the eavesdropper is 
required to approach zero asymptotically. 


and eavesdropper is zero) is ll6l-|[^ 

where negative C(R) is interpreted as zero rate, W* = H'Hj, and the seereey eapacity subject 
to the total Tx power constraint is 

Cs = max(7(R) s.t. trU < Pt (3) 

R>0 

where Pt is the total transmit power (also the SNR since the noise is normalized). It is well- 
known that the problem in ([3]) is not convex and hence very difficult to solve in general and 
explicit solutions for the optimal Tx covariance is not known for the general case, but only for 
some special cases, e.g. low/high SNR, MISO channels, full-rank or rank-1 case O-flUl or for 
the parallel channel 

Since dS]) is not a convex problem in the general case, not only widely-used Karush-Kuhn- 
Tucker optimality conditions are not sufficient, but also the convergence of a numerical algorithm 
to a global optimum is very difficult if not impossible to insure since the standard tools of convex 
optimization fail to work and, in general, non-convex problems are much harder to deal with 
lf23]l . Thus, ([3]) is very difficult to solve either analytically or numerically in the general case. 
Even when C(R) is concave so that the problem becomes convex (when the channel is degraded, 
Wi > W 2 ), its analytical solution is not known, except for the special cases noted above, and 
the known convex solvers [[30l - [l3^ are not able to solve the problem, even in this convex setting 
so that a custom-made algorithm has to be developed. 
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To go around this difficulty, we use the following minimax representation of the seereey 
eapaeity. 


III. Minimax Representation of Secrecy Capacity 

A minimax representation of the seereey eapaeity was obtained in 10 via a ehannel enhanee- 
ment argument and a clever bounding technique, which is instrumental for our algorithm and is 
summarized below. 


Theorem 1 (Theorem 1 in 0). The secrecy capacity of Gaussian MIMO-WTC channel in ([2]) 
can be presented in the following minimax form : 


Cs = max min f(R, K) = min max f(R, K) 


where 




K = 


I K( 


I + W 2 RI 

> 0, H = 


21 


> C'(R), 

Hi 
H2 


K 21 I 

and the optimization is over the set S of all feasible R, K.- 

S = {(R, K) : trU < P, R, K > 0, K is as in @}. 


(4) 

(5) 

( 6 ) 

(7) 


The upper bound in © via /(R, K) was obtained from a genie-aided reeeiver whieh knows 
y 2 (in addition to yi) and K represents noise eovarianee between and ^ 2 - Minimization over 
K is due to the fact that the true eapaeity does not depend on K while the upper bound does so 
it’s natural to seek the least upper bound. This bound ean also be used in a numerieal algorithm 
to evaluate the optimality gap with respeet to minx for eaeh R. In faet, ® states that letting the 
receiver to know y 2 in addition to yi does not inerease the secrecy capacity under the worst-case 
noise covariance, whieh is rather surprising. 


Remark 1. 2nd equality in dH) expresses the saddle-point property, whieh is equivalent to the 
following inequalities (see e.g. 

/(R,K*)</(R*,K*)</(R*,K) (8) 
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which hold for any feasible R, K, where (R*,K*) is the optimal (saddle) point of (HI). These 
inequalities follow from von Neumann minimax Theorem sinee /(K, R) is eonvex in K for 
any fixed R and eoncave in R for any fixed K (and for any ehannel, degraded or not), and the 
feasible set in (|7]) is eonvex. 

Remark 2. It is the convex-eoneave nature of /(R, K) along with the saddle-point property 
in ([8]) and the eonstraints in (|7]) that make the respeetive KKT eonditions suffieient for global 
optimality (see e.g. and [[^ for more details; note that Slater’s eondition holds for these 
problems). This eannot be said about the original problem in dS]). The suffieieney of the KKT 
eonditions is the key for our algorithm and a proof of its convergence to a global maximum 
(rather than just a stationary point). 

While the equivalence of dH) and dl]) was established in |l6l, an analytieal solution of any 
one is not known in the general ease. In faet, no analytieal solution is known for the latter. 
Despite its more complieated appearance due to two eonflicting optimizations, dH) is in faet 
easier to solve than d3]), at least numerically, since both optimizations are eonvex and the 
respeetive KKT eonditions are suffieient for global optimality; a proof of convergence of the 
eorresponding numerical algorithm to a global optimum is also within reach for any channel. 
While the standard tools developed for single eonvex optimization ll2^ do not apply direetly 
here due to two eonfiieting optimizations involved, their primal-dual reformulation does work, 
as explained below. 

We proeeed to solve the minimax problem in dll) via KKT conditionj^. Subsequently, a 
numerical algorithm is developed with guaranteed eonvergenee to a global optimum for any 
ehannel, degraded or not, whieh is not possible for dl]) due to its non-eonvex nature in the 
general case. The Lagrangian for the problem in dll) is 

L = /(R, K) - frMiK + trMaR - A(frR - P) 

+ fr A(K - I) (9) 

where Mi, M 2 > 0 are (matrix) Lagrange multiplies responsible for the positive semi-definite 
constraints K,R > 0, A > 0 is (scalar) Lagrange multiplier responsible for the total power 

^See e.g. 1231 for a background on KKT conditions. 
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A = 


( 10 ) 


Ai 0 
0 Aa 

is a (matrix) Lagrange multiplier responsible for the eonstraint on K as in There are two 
sets of KKT eonditions - one per optimization in ([4]). For the maximization over R, the KKT 
conditions are (to simplify notations, we have omitted the | factor): 

WrL = (I + WR)"iW - (I + W2R)“^W2 + M 2 - ai 

= 0 , ( 11 ) 

M2R = 0, (12) 

trR <P, R, M 2 > 0, A > 0, (13) 

where Vr is the gradient (derivative) with respeet to R and W = The KKT conditions 

for the minimization over K are 


VifL = (K + Q)-i-K-i-Mi + A = 0, (14) 

MiK = 0, (15) 

K,Mi>0, (16) 

and K, A are as in ®, (fTOl) : Q = HRH'. Here, we implicitly assume that K > 0. While 

the singular case was treated in a separate way in we do not need a separate treatment 
here sinee our numerieal algorithm is iterative and, at eaeh step, it produees a non-singular K 
whieh, however, may be arbitrary elose to a singular matrix (i.e., may have arbitrary small but 
positive eigenvalues). This models numerieally a case of singular K and is a standard feature 
of the barrier method in general, where the boundary of the constraint set can be approached 
arbitrary elosely but never aehieved (see e.g. Chapter 11 in Il23]| for more detail). We remark 
that negligibly-small eigenvalues ean be rounded off to 0 and they also imply that the numerieal 
rank is low. 

An optimal point in (U) must satisfy both sets of KKT conditions simultaneously and these 
conditions are also suffieient for global optimality, as noted above. An analytical solution to 
these eonditions is not known. Our numerical algorithm in Section V solves these two sets of 
KKT eonditions in an iterative way, with guaranteed eonvergenee to a globally-optimal point. 


April 16, 2015 


DRAFT 


10 


IV. Barrier Method for Minimax Optimization 


In this section, we first give a brief introduction into Newton and barrier methods for inequality- 
constrained optimization; the reader is referred to Chapters 9-11 of ll2^ for more details and 
background information. These two methods are used as key components to construct an algo¬ 
rithm for minimax optimization. Subsequently, this algorithm is adapted to the secrecy problem 
in dll) and its guaranteed convergence to a global optimum is proved for any channel (degraded 
or not) in Section V. 

A. Minimax problem via primal-dual Newton method 

Newton method for an equality-constrained problem essentially transforms the problem into a 
sequence of quadratic problems for which the sufficient KKT conditions are a system of linear 
equations [|2^ . 



Let us consider the minimax problem of the fori 



(17) 


where vectors x, y represent optimization variables, the objective /(x, y) is concave in x and 
convex in y; given matrices A^., and vectors ba-, represent the equality constraints for 
each variable. The KKT onditions for this problem are 


Va;/ -f A^Aa: = 0, Aa;X - ba, = 0, 

S/yf + AyXy = 0, Aj^y - bj^ = 0, 


( 18 ) 


where Aa;, \y are dual variables, and they are sufficient for global optimality. 

While the standard Newton method can be used for both optimizations, a proof of its con¬ 
vergence is challenging since the objective is not monotonous (it decreases in one step and 
increases at the other). The residual form of the Newton method is preferable since, as it 
was observed in lf2^ . it reduces the norm of the residual at each step and thus generates a 
monotonous sequence whose convergence to zero can be guaranteed. To introduce this method, 

^A similar problem, without equality constraints, have been briefly considered in (23). More details can be found in l33l . Our 
development here is tailored to be used for the secrecy problem in ©. 
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let us aggregate variables, derivatives and parameters as follows: 


z = 


A = 


V/ = 


X 

y 

0 

0 A 

V./ 

V,/ 


, A = 


Ax 

A,, 


b = 


y 


, vV = 


vL/ 


(19) 


( 20 ) 


The KKT eonditions in (fTSl) ean be east in a residual form: 

r=[(V/ + A'A)',(Az-b)T = 0. (21) 

The Newton method iteratively solves r = 0 using Ist-order approximation (Newton step): 

r(wo + Aw) = r(wo) + DrAw + o(Aw) 

~ r(wo) + DrAw (22) 

where w = [z', A'] is the veetor of aggregated (primal/dual) variables, wq and Aw are its initial 


value and update, -Dr is the derivative of r(w): 

Dr = 


' dv dr' 


VV(zo) A' 

dz' ’ d\' 


A 0 


= T 


and T is the KKT matrix. Now, setting r(wo + Aw) = 0 and solving for Aw from 
the update 


Aw : TAw = —r(wo) 


(23) 
gives 

(24) 


We further show in Seetion V that T is non-singular for our problem so that this system of 
linear equations is guaranteed to have a unique solution for any set of parameters^ 

Having the steps Aw = (Az', AA')' eomputed, the primal/dual variable updates are 


z = zo + sAz, A = Ao + sAA 

where the step size s is found via the baektraeking line seareh 


(25) 

as in Algorithm 1 below. 


^While Aw = —T ^r(wo) is its analytical solution, it is not computed in practice since computing T ^ is computationally- 
expensive and may result in loss of accuracy for ill-conditioned T, see e.g. 1261 . 
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Algorithm 1 Backtracking line search 

Require: wq, 0 < a < 1/2, 0 < /3 < 1, s = 1. 

while |r(wo + sAw)| > (1 — as)|r(wo)| do s := (3s 

end while 


In this Algorithm, a is a % of the linear decrease in the residual one is prepared to accept 
at eaeh step, and /3 is a parameter eontrolling the reduction in step size at eaeh iteration of the 
algorithm. The Newton method in combination with the baektraeking line seareh is guaranteed to 
reduce the residual norm |r(w)| at each step aeeording to the following residual norm-reduction 
property Il2^ : 


d 

ds 


r(wo + sAw)| = -|r(wo)| < 0, 


(26) 


so that, for sufficiently small s, the residual indeed shrinks at eaeh iteration (unless |r(wo)| = 0, 
whieh implies that wq is optimal). This insures convergence of the algorithm to a global optimum 
sinee KKT eonditions are sufficient for optimality and any loeally-optimal point is automatieally 
globally-optimal as the problem is eonvex. 


Algorithm 2 Newton method for minimax optimization 
Require: zq, Aq, a,/3, e 

repeat 

1. Find Az, AA using Newton step in ([24]). 

2. Find s using the baektraeking line seareh (Algorithm 1). 

3. Update variables: z^+i = z^ + sAz, A^+i = A^ + sAA. 
until |r(zfc+i, Afc+i)| < e. 


Based on this, the Newton method for minimax optimization is as in Algorithm 2. The 
convergence of this algorithm to a global optimum is insured by the eonvex/concave nature 
of the objective, suffieiency of the KKT eonditions in (fTSl) . non-singularity of the KKT matrix 
T at eaeh step (as proved in Seetion V) and the norm-decreasing residual property in (l26l) . 
whieh ensures that the method generates a sequenee of sub-optimal solutions with monotonieally 
deereasing residuals, for whieh the stationary point has zero residual and thus solves the suffieient 
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KKT conditions. While the global optimum point eorresponds to zero residual, |r| = 0 (this is 
equivalent to the KKT eonditions in (fTSl) '). the praetieal version |r| < e of this eondition is used 
in Algorithm 2 as a stopping eriterion. This form of the stopping eriteria is justified by not 
only the residual form |r| = 0 of the KKT eonditions, but also by the norm-decreasing residual 
property in (|2^ . 

As a side remark, we note that this algorithm ean also be used to solve the problem in (fTTI) 
with max and min interchanged, due to the saddle point property. 

B. Barrier method for inequality-constrained problems 

Let us now eombine the barrier method and the minimax method above to eonstruet an algo¬ 
rithm for minimax optimization with equality and inequality eonstraints. Consider the following 
problem with inequality constraints: 

maxmin/(x, y), s.t. A^-x = b^;, A^y = hy, 

X y 

/i(x)<0, /2(y)<0 (27) 

where /i and /2 are the eonstraint funetions. The key idea of the barrier method is to use a soft 
instead of hard eonstraints by augmenting the objective with the barrier functions responsible 
for the inequality constraints so that the new objeetive for the problem in (l27l) becomes: 

/t(x,y) = /(x,y) +'0t(/i(x)) -Mf 2 {y)) (28) 

where we use the logarithmie barrier funetion: 

'ftix) = ^\n{-x) (29) 

and where t is the barrier parameter. The barrier method transforms the inequality-eonstrained 
problem in (1271) into the following problem without inequality constraints: 

maxmin/t(x,y), s.t. A^x = b^, A^^y = b^ (30) 

^ y 

The optimality gap due to this transformation ean be upper bounded as follows. 

Proposition 1. The optimality gap of the barrier method in (l30l) applied to the minimax problem 
in (1271) is as follows: 

\f{^*{t),y*{t))-p*\<l/t (31) 
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where p* is an optimal value of the original problem in (1271) and (x*(t), y*(t)) is an optimal 
point for the modified problem in (l30l) . 

Proof: This is a special case of Proposition 3 below with m = rii = n 2 = 1. ■ 

Thus, by selecting sufficiently high t, one can obtain arbitrary small gap. Newton method is 
used to solve the modified problem with any desired accuracy. 

In practice, the modified problem is solved in an iterative way by selecting first a moderately- 
large value of t, solving the problem, increasing t and using the previous solution as a starting 
point for a new one. In this way, the total number of Newton steps required to achieve certain 
accuracy is minimized If23]|. The algorithm is as follows. 

Algorithm 3 Barrier method 

Require: z, A, e > 0, f > 0, /i > 1 

repeat 

1. Solve the problem in (l30l) using Newton method (Algorithm 2) starting at z, A. 

2. Update variables: z ;= z*(f), A := t := pt. 

until 1/t < e. 


V. Barrier Method eor Secrecy Rate Maximization 

In this section, we use the minimax barrier method above to solve the optimal covariance 
problem in dH) iteratively with guaranteed convergence to a global optimum, which is also 
optimal for ([3]). 

A. Choice of variables 

Since the original variables are positive semi-definite matrices R, K and the barrier method 
above requires vectors, we have two options: 

1. Use all entries of R, K as independent variables via x = t>ec(R), y = fec(K), where 
operator vec stacks all columns into a single vector. Enforce the symmetry constraints R' = 
R, K' = K and the equality constraint on K in (0 via extra equality constraints. 

2. Use only lower-triangular entries of R as independent variables via x = vech{R), where 
vech stacks column-wise all lower-triangular entries into a single column vector, and use only 
K 21 : y = nec(K 2 i). 
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It can be shown that these two options are mathematieally equivalent, i.e. produee exaetly the 
same solutions at eaeh step of Newton method. Option 2 is a preferable ehoiee for implementation 
sinee the number of variables and eonstraints is redueed so that it is more effieient. Therefore, we 
use Option 2 for further exposition. Gradient and Hessian ean be evaluated either numerieally (in 
a standard way) or analytieally as given below. We find the analytieal evaluation to be preferable 
as numerieal one entails a loss of preeision while approaehing an optimal point (this is espeeially 
pronouneed at high SNR, large t and for large systems). 

Sinee the algorithm requires initial point to begin with, we use the following point: 


P 

Ro = —I —!■ xq 
m 


Ko = 


= nech(Ro), 

I ^ yo = 0, 

Ao = 0 


(32) 

(33) 

(34) 


As ean be easily verified, the initial point above is feasible (i.e. satisfies the eonstraints). The 
ehoiee of Rq is motivated by the faet that isotropie signalling does not prefer any direetion and 
thus is equally good a priori for any ehannel. Kq eorresponds to isotropie noise and is motivated 
by the same reason. It should be emphasized that the algorithm eonverges for any (feasible) 
initial point, due to the eonvex nature of the problem, to a global optimum; the differenee is in 
how fast. 

To aeeount for the positive semi-definite eonstraints R, K > 0, the following barrier funetion 
is used 

?/;i(R) = ^ln|R| (35) 

L 

so that the modified objeetive ft is 


ft{R, K) = /(R, K) + V^,(R) - 


(36) 


Note that this requires K, R > 0, i.e. they are strietly inside of the feasible set but ean approaeh 
the boundary arbitrary elosely as t inereases, so that some eigenvalues may beeome arbitrary 
elose to zero (and the numerieal rank may be defieient); this models numerieally the ease of 
singular R and/or K and is a standard feature of the barrier method in general [|23]| . The inequality 
in (15^ makes sure that the optimality gap due to this ean be made as small as desired. In a 
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practical implementation, one ean round off negligibly-small eigenvalues of R to zero to simplify 
implementation. 

After some manipulations (see Appendix for details), the gradients and Hessians ean be 
expressed as: 

VJt = Vyh = D>ec(VK/t), (37) 

= -D^(Zi (g) Zi - Z2 ( 8 ) Z2 

+ (g) R“^)Dm, (38) 

= DU-(K + Q)-^ 0 (K + Q)-^ 

+ (l + ri)K-i0K-')D„, (39) 

Vlyft = -D'^(H'(K + Q)-i 0 H'(K + Q)-')D„, (40) 

where 

Vij/t = Zi-Z2+f-'R-\ (41) 

VKft = (K + Q)"i - (1 + t-^)K-\ (42) 

Zi = (1 +WR)-^W, (43) 

Z 2 = (1 +W2R)"^W2, (44) 

and 0 is a Kroneeker produet, is a x m(m + l)/2 duplieation matrix defined from 
fec(R) = DmVechCR) Il27ll[|28ll . is a x nin 2 redueed duplieation matrix defined from 
dk = D„(ik, where 


dk = vec{dK.), dk = vec{dK. 


21J, 


dK = 


0 dK( 


21 


(45) 


(iK 2 i 0 

and n = Hi + 77 , 2 . It ean be obtained from D„ by removing its eolumns eorresponding to all 
entries of K but those in K 21 . 
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It can be shown (see e.g. [[T4ll ) that using the full available power is optimal. Therefore, 
one ean use the equality eonstraint frR = P instead of the inequality frR < P. The equality 
eonstraint matrix A and veetor b take the following form: 


A = [a', O'], b = P 


(46) 


where 1^ is m x m identity matrix, a = vech(lm), and 0 is nin 2 x 1 zero veetor, i.e. A is a 
row veetor and b is a sealar in our setting. 

With this ehoiee of variables and initial points. Algorithm 3, in eombinations with Algorithms 
1 and 2, ean now be used to solve numerieally the minimax problem in ([4]). 

B. Convergence of the algorithm 

Here, we provide a proof of eonvergenee of the proposed algorithm to a global optimum. First, 
one has to insure that Newton step is well defined for all f, R, K > 0. This, in turn, insures 
that the Newton method produees a sequenee of deereasing-norm residuals (aeeording to (l26l) '). 
whieh eonverge to zero for eaeh t. Consequently, the minimax barrier method applied to our 
problem generates a sequenee of sub-optimal points z*(f) that eonverges to a global optimum (a 
solution of the suffieient KKT eonditions in (fTTI) - (fT^ ') as t inereases, sinee /t(R, K) is eonvex 
in K and eoneave in R and also twiee eontinuously differentiable for eaeh R > 0, K > 0 
(more details ean be found in ll23ll ). 

To make sure that Newton step is well defined for eaeh f, R, K > 0, we demonstrate that the 
KKT matrix for the modified objeetive ft is non-singular, so that the Newton equations have a 
well-defined solution as in (l24l) . 

Proposition 2. Consider the minimax problem in (fTTl) for the objective in (l3^ under the equality 
constraint parameters as in (l46l) . Its KKT matrix 


T 



(47) 


A 0 


is non-singular for each t > 0, R, K > 0. 


Proof: The proof is based on the following three Lemmas. 
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VVt = H = 


(48) 


Lemma 1. The Hessian 

^ -Hn Hi2 
H21 H22 

is non-singular if partial Hessians Hu, H 22 are non-singular, i.e. if Tin, H 22 > 0, where Hu = 
-VL/i, Hi 2 = V%ft. H 21 = H;^ = VlJt, H 22 = VlJt. Furthermore, block (1,1) [H^ijn 
of the inverse H~^ is also non-singular. 


Proof: The proof is complicated by the fact that is indefinite matrix, since ft is 
concave in x and convex in y (i.e. Vl^^ft < 0 > > 0 ), so that the standard proofs tailored 

for positive definite matrices [|2^ do not apply here. However, since Hii,H 22 > 0, it follows 
that 


S 22 = -Hn - H'iH 2 - 2 'H 2 i < 0, 

Sn = H 22 + H 2 iHr/H '1 > 0, (49) 


where 811 ( 22 ) is Schur complement of —Hii(H 22 ), so that the matrix inversion Lemma in 
Proposition 2.8.7 of ifMll applies and one can invert H as follows 1^ 


-Hn H'l 

H 21 H 22 


82 - 2 ' 

Sn'H2iHr/ 


which implies that H is non-singular and that [H 


q—ixj/ TT—1 

■‘^22 ■n-21-'^22 

q-1 

= S 2 - 2 ' < 0. 


11 


(50) 


Lemma 2. The KKT matrix in Proposition 2 is non-singular under the conditions of Lemma 1. 

Proof: We proceed as follows. Since the Hessian = H is non-singular (under condi¬ 
tions of Lemma 1), let us apply the following transformation that preserves the determinant of 


®This idea of the proof was suggested by a reviewer. 
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and observe that 



H 

A' 


I - 

-H-^A' 

T = 







A 

0 


0 

I 


H 


0 




A 

-AH 

-lA' 

? 

|T 

^1 

T| = 

HK-AH-^A') 


(51) 


(52) 


(this follows from the properties of bloek-partitioned matriees and their determinants, see e.g. 

From Lemma 1, |H| ^ 0. Further notiee that AH“^A' = a'[H~^]iia < 0, sinee [H“^]ii < 
0 from Lemma 1 and a 7 ^ 0. Using (l52l) . |T| = |H|(—AH^^A') 7 ^ 0 so that the KKT matrix 
T is non-singular. ■ 

Thus, Lemmas 1 and 2 establish the non-singularity of KKT matrix provided that partial 
Hessians are non-singular. This is indeed the ease as Lemma 3 below shows. 

Lemma 3. Partial Hessian in (1^ and (l39l) are non-singular for each t > 

0, R, K > 0. 


Proof: See Appendix. ■ 

Combining Lemmas 1-3, Proposition 2 follows. ■ 

Thus, Proposition 2 insures that Newton step is always well-defined and henee generates a 
sequenee of deereasing-norm residuals (aeeording to whieh eonverges to zero for eaeh 
f > 0. The next proposition speeifies the optimality gap of the minimax barrier method for a 
given t. 

Proposition 3. For each f > 0, the optimality gap of the barrier method applied to the minimax 
problem in dH) can be upper bounded as follows: 

|/(R*(f), K*(t)) — Cs| < max(m, ni + n 2 )/t (53) 

where R*(f), K*(t) are the optimal signal and noise covariance matrices returned by the barrier 
method for a given t. 
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Proof: Using the bounds for the minimax problem in If3^ and adopting them to the problem 
in dH), one obtains 

max / (R, K* (f)) - m/f < / (R* (t), K* (f)) (54) 

R 

< min /(R*(t), K) + (ni + n 2 )/t 

K 

so that 

f{R*{t),K*{t)) < nnn/(R*(t),K) + (m +^ 2 )/^ 

< max min /(R, K) + (rii + n 2 )/t 

R K 

= Cs + {ni+n2)lt, (55) 

> max/(R,K*(f)) -m/f (56) 

R 

> min max /(R, K) — m/t = Cs — m/t 

from whieh (15^ follows. ■ 

Therefore, using suffieiently large barrier parameter t insures any desired aeeuraey, and f{R*{t), K*(f)) —)■ 
Us as f —)■ cxo. If desired aeeuraey is e, then the stopping eriterion in Algorithm 3 should 
be max(m,ni + n 2 )/t < e (assuming that the Newton method produees suffieiently-aceurate 
solution, whieh is always the ease in praetiee due to its quadratie eonvergenee, see [[23ll ). 

C. Dual Problem 

While the algorithm above is designed to maximize the seereey rate, its optimal eovariance 
also solves the dual problem of minimizing the total transmit power subjeet to the seereey rate 
eonstraint U(R) > Rs, i.e. 

min trR s.t. U(R) > Rs, R > 0 (57) 

This ean be easily shown by eontradietion and observing that 1st inequality in (1571) always 
holds with equality, or by eomparing the respeetive KKT eonditions (whieh are neeessary for 
optimality in both problems), both under the eondition Rg = Cg. 
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D. Per-antenna Power Constraints 

Different forms of power constraint can also be incorporated into the proposed algorithm in 
a straightforward way. In particular, the per-antenna power constraint ru < Pi, where ru is i-th 
diagonal entry of R (power in antenna i) and Pi is the maximum power of i-th antenna, can 
be adopted by eliminating matrix A from the KKT equations and adding m extra barrier terms 
ln(Pj — Tii) representing new power constraints in (1^ . As a starting point, one can use e.g. 
ru = Pi/2. 

In fact, these new constraints can be added to the existing ones as well, representing the 
scenario where not only the total power budget is limited but also the per-antenna powers are 
limited due to e.g. limited dynamic range of power amplifiers. 

The convergence of this modified algorithm to a global optimum can be proved in the same 
way as above (with minor modifications). In particular, one can observe that the new barrier 
terms preserve the non-singularity of the KKT matrix and the convex nature of the problem. 

VI. Degraded Channel 

If the channel is degraded, Wi > W 2 , then C(R) is concave and the corresponding op¬ 
timization problem in dH) is convex. Therefore, the barrier method can be applied directly 
to this problem with guaranteed convergence to a global optimum. This reduces the problem 
complexity since there is no minimization over K so that the number of variables reduces from 
m{m -f l)/2 -I- nin2 to m{m + l)/2, which is a significant improvement when nin2 is large. 

The modified objective (with the barrier term) becomes 

(58) 

the variables are z = x = vech/R) (no y) and the equality constraint parameters are 

A = a' = vech{l), h = P, (59) 

Non-singularity of the KKT matrix, which guarantees well-defined Newton steps, can be 
established following the lines of the analysis in Section V. In particular, one observes that 
Lemmas 1-3 hold. Lemma 3 holds since 

VL/t < 0 ( 60 ) 
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Lemma 1 holds since the Hessian in this case is H = Lemma 2 holds since 

a'H~^a < 0 (61) 

so that the KKT matrix is non-singular and thus KKT conditions have a well-defined solution 
at each step of the barrier method. 

The optimality gap in this case becomes 

\C{R*{t))-Cs\ <m/t (62) 

where R*(f) is an optimal R returned by the Newton method for a given t, i.e. it is smaller 
for the same t than in the non-degraded case (15^ . which is an extra advantage (in addition to 
having less variables). For desired accuracy e, the stopping criterion in Algorithm 3 is m/t < e. 

As a side remark, we note that even though the problem is convex in this case, existing 
convex solvers (see e.g. If30]| - [l33 ) cannot be used to solve it directly since they do not allow 
difference of logarithms or matrix powers in objective/constraint functions, while the algorithm 
above solves it with guaranteed convergence to a global optimum. 

VII. Numerical Experiments 

To validate the algorithm and analysis and to demonstrate the performance of the algorithm, 
extensive numerical experiments have been carried out. Some of the representative results are 
shown below. 

Convergence of the Newton method for different values of the barrier parameter t is demon¬ 
strated in Fig.2 for 

Hi = 

H2 = 

which shows the residual r Euclidian norm versus Newton steps. Even though this channel is not 
degraded, since the eigenvalues of Wi — W 2 are {0.395, —3.293}, the algorithm does find the 
global optimum (this particular channel was selected because it is ’’difficult” for optimization). 
Note the presence of two convergence phases: linear and quadratic, which is typical for Newton 
method in general. After the quadratic phase is reached, the convergence is very fast (water-fall 


0.77 

-0.30 

-0.32 

-0.64 

0.54 

-0.11 

-0.93 

-1.71 


(63) 
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Fig. 2. Convergence of the Newton method for different values of f; m = 2, P = 10, a = 0.3,/3 = 0.5, Hi,H 2 as in ( I63t . 
Note the presence of two convergence phases: linear and quadratic. It takes only about 10 to 20 Newton steps to reach the 
machine precision level. 



0 ) 



Fig. 3. Secrecy rates for the same setting as in Fig. 2. Solid line - via the upper bound in (the lines coincide for different 
t), dashed - via C'(R) in 


region). It takes about 10-20 Newton steps to reach very low residual (at the level of machine 
precision). This is in agreement with the observations in ll2^ (although obtained for different 
problems). 

Fig. 3 shows the corresponding secrecy rate evaluated via the upper bound in ([S]) and the actual 
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Fig. 4. Convergence of the barrier method (incrementally increasing f); m = 5, ni = 712 = 10, P = 10, a = 0.3, P = 0.5, 
/r = 5, Hi, H 2 are randomly generated (i.i.d. Gaussian entries of zero mean and unit variance). It takes about 5 to 10 steps to 
reduce the residual to a very low value of 10~^° for each t. 


achievable rate via (i7(R(t)) in (l2l), where R(t) is an optimal covariance at a partieular step of 
the Newton method and for a given t. As the algorithm converges, they become almost equal if 
t is suffieiently large (in this ease, about 10^...10^). While t has negligible impaet on the upper 
bound, it does affeet significantly the eorresponding (7(R(t)) (since the optimal covariance R(f) 
returned by the barrier method depends on t and C(R) is sensitive to R), so that the ehoiee of 
t is not eritieal if the seereey eapaeity is the only quantity of interest (sinee the upper bound 
is quite tight even for moderate t). However, if a transmitter is implemented with the optimal 
eovariance R(f) returned by the algorithm, it is (7(R(t)) that determines the achievable rate and 
this ehoiee is important. We attribute this faet to higher sensitivity of C'(R) to R eompared to 
that of /(R, K). Similar observations apply to the number of Newton steps required to aehieve a 
eertain performance: if Cs is the quantity of interest, the upper bound converges to it in about 3-5 
steps. However, if implementing R is involved, one should use C(R) and, in addition to proper 
choice of t, it takes about 5... 10 steps to aehieve the eonvergenee. Note that, in both cases, the 
number of steps is not large and the exeeution time is small (a few seeonds). In general, larger 
t and m,ni,n 2 require more steps to aehieve the same aeeuraey. As expeeted, the behavior of 
upper bound is not monotonic while the residual norm does deerease monotonically in each step. 

Fig. 4 and 5 demonstrate the convergence of the minimax barrier method (inerementally 
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tu 


m=5, n,=n2=10 


- upper bound 
■C(R) 


2 - 


1 -/ 


i-H 


i-H 






i-H 


i-H 






10 15 20 25 30 

Newton step 


35 40 45 50 


Fig. 5. Secrecy rates for the same setting as in Fig. 4. Solid line - via the upper bound in 0, dashed - via C'(R) in l|2). Note 
that while the capacity value evaluated via the upper bound converges very fast, significantly more iterations are required for 
convergence of the secrecy rate C(R). We attribute this to the fact that C'(R) is more sensitive to R than /(R, K) is. Also 
note the significantly non-monotonique behavior of the former. 


increasing t) for a larger system (m = 5 , rii = 77,2 = 10). Note that a very low residual value of 
10“^° is aehieved after about 7 Newton steps for eaeh value of t. Using inerementally-inereasing 
t as opposed to a fixed large value results in a smaller number of the total Newton steps required 
to aehieve a given residual value and is less sensitive to system parameters and size. Also observe 
from Fig. 5 that while the upper bound eonverges quite fast (in a few Newton steps), it takes 
significantly more steps for C (R) to converge and the convergence process is signifieantly non¬ 
mono tonie. 

To demonstrate the eonvergenee performanee for different ehannel realizations, Fig. 6 and 7 
show the distribution (histograms) of the number of steps required to achieve the residual of 
10“^° and 10“® for 100 randomly-generated channels (with i.i.d. Gaussian entries of zero mean 
and unit varianee) for m = 4 , rii = 772 = 3 and 777, = 5 , 77 i = 772 = 10 systems. While the aetual 
number of required steps depends on a partieular channel realization, 20 to 40 steps are suffieient 
in most oases. We attribute this to the two-phase behaviour of the algorithm’s eonvergenee: once 
the quadratio (water-fall) phase is reaehed, it takes just a few steps to reduee the residual to a very 
low value (whioh is oonsistent with similar observations in lf2^ . albeit for different problems). 


April 16, 2015 


DRAFT 























26 



# of Newton steps 


Fig. 6. A histogram showing the distribution of the number of Newton steps needed to achieve the residual of via the 

minimax barrier method for 100 randomly generated channels (i.i.d. Gaussian entries of zero mean and unit variance); P = 10, 
a = 0.3, P = 0.5, m = 4, rii = n2 = 3, fo = 100, tmax = 10®, = 10. 



# of Newton steps 


Fig. 7. A histogram showing the distribution of the number of Newton steps needed to achieve the residual of 10 “ 
for 100 randomly generated channels (i.i.d. Gaussian entries of zero mean and unit variance); m = 5, ni = n 2 = 10, 
to = 100, tmax = 10®, At = 10, P = 10, a = 0.3,/3 = 0.5. 


Different ehannel realizations result in a different number of required steps for the linear phase, 
before the quadratic phase is reached, but do not affect much the latter. 
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VIII. Conclusion 


Global secrecy rate maximization for (non-degraded) Gaussian MIMO-WTC has been dis¬ 
cussed. The problem is challenging due to its non-convex nature and no analytical solution is 
known for this setting. While the known numerical algorithms converge to a stationary point 
(which may be a local rather than global maximum or just a saddle point), the algorithm proposed 
herein is guaranteed to converge to a global rather than local maximum. The algorithm is based 
on the minimax reformulation of the secrecy capacity problem (to insure global convergence) 
and the primal-dual reformulation of the Newton method in combination with the barrier method. 
A proof of its global convergence is also given. Numerical experiments indicate that 20 to 40 
Newton steps are sufficient for convergence with high precision (up to the machine precision 
level). Extra power constraints (e.g. maximum per-antenna power) can be easily incorporated in 
the algorithm. The dual problem of total power minimization subject to the secrecy rate constraint 
can also be solved. 
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X. Appendix 


A. Gradients and Hessians 

To derive the gradient and Hessian expressions, we use the tools of matrix differential calculus 
If27]llf2^ . Let us consider /(X) = In |X|, where X > 0 is n x n positive definite matrix. Using 
the perturbation method. 


/(X + dX) = In |X| + In |I + X~^dX| 


(64) 


= /(X) + A.(X-‘dX) - iAj(X-‘ciX) + o(A?) 


i 


= /(X) + tr{X-^dX) - hr{X-^dXX~^dX) + o({A2}) 


Using 


tr{X-^dX) = vec{dXyvec{X-^) 
= dx'D;nec(X“i) 


(65) 
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where dx. = vech{dX), one obtains the gradient V^f = D^nec(X ^). Applying this to 

/(K) = In |I + K^iQI = In |K + Q| - In |K|, (66) 


Vyft follows. Using 

fr(X"VXX^^dX) = uec(dX)'(X"^ 0X~^)uec(dX) 

= dx’D'^{X-^ ® X-^)D„dx 

the Hessian ean be identified as 

vL/ = -d;(x-^(8X-1)d„ 

Applying this to /(K), V^/t follows. 

To derive Vxft and V^/i, use a modifieation of (l64l) for /(R) = In |I + WR 
/(R + dR) = /(R) + tr{ZdlV) — -tr{ZdIi,ZdlV) 

+ 

i 

where Z = (I + WR)~^W, so that 

Writ = D>ec(Z) 


where r = uec/i(R), and 


Vlft = -D'^{Z^Z)Bm 


from whieh (lT7l) . (1^ follow, where we have used the following identities llTTll : 


fr(AB) = t>ec(A')'t>ec(B), 
tr(ABCD) = (necD)'(A ® C')vec{B') 
and the faet that Z is symmetrie, Z' = Z. To derive Vlyft, observe that 

VL/(R,K) = VLln|K + HRH'| 


where dk = vec{dK.), so that one needs to eonsider only 

/(R,K) = ln|K + HRH' 


(67) 

( 68 ) 

(69) 

(70) 

(71) 

(72) 

(73) 

(74) 
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for which the perturbation method gives 

/(R + dR, K + dK) = /(R, K) 

- fr(H'(K + Q)-MK(K + Q)-^HdR) + A/ (75) 

where A/ denotes all other terms (whieh do not affeet the mixed derivatives), from whieh (l40l) 
follows by using vec operator inside the traee. 


B. Proof of Lemma 3 

Observe that Q > 0 so that (K + Q)“^ < and thus 

K-i®K-i-(K + Q)-i®(K + Q)-^ >0 (76) 

(this follows from the properties of Kronecker produets, see e.g. (2911) and 

(1 + 0 - (K + Q)-^ 0 (K + Q)“i 

> 0 > 0 (77) 


Now consider the following quadratie form for any y f 0: 


y'Vj,/iy = y'((l + f-')K^'0K^' 

-(K + Q)-^0(K + Q)-^)y >0 


(78) 


since y = D„y 7 ^ 0 (this follows from the fact that all columns of D„ are linearly independent, 
whieh in turn is implied by linear independence of columns of D„ since it has a full column 
rank (271). Therefore, V^^/i > 0. Non-singularity of Vl^ft can be proved in a similar way. 
First, one observes that W > W 2 : 

W = H'K”^H (79) 


I 

1 

CM 

-1 

1- 

1_ 

1 - 

w 

to 

I 


H 2 


= H'H 2 + (Hi - K'iH2)'(I - K' 1 K 21 )-' 


(80) 


X (Hi - K'^Hs) 

> H' 2 H 2 = W 2 


(81) 

(82) 
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since 2nd term in ([Ml) is positive semi-definite, where we have used the matrix inversion Lemma: 

, I K' 

K-^ = 

K21 I 

(I - K'iK 2 i)-i K'i(K 2 iK'i - I)-' 

(K21K'1 - I)- 1 K 21 (I - K2iK'i)-i 

and the faet that K21K21 < I, K21K21 < I, whieh follows from K > 0 (sinee this implies 
IK 21 I 2 < 1, where | ■ I 2 is the speetral norm, see e.g. Il29l l. Therefore, Zi > Z 2 , whieh follows 
from the following argument when W, W2 are non-singular: 


(83) 


w > W 2 ^ 

(84) 

^ W-1 + R < + R 

(85) 

^ Zi = (W^ + R)-^ 


> (W 2 -' + R)-' = Z 2 

(86) 


When W and/or W 2 are singular, one ean use the eontinuity argument If29ll : use = W-fel > 
0, W 2 e = W 2 -f el > 0 with e > 0, instead of W, W 2 and then take e —)■ 0; sinee both sides 
of the inequality are eontinuous funetions, the result follows. Sinee Zi > Z 2 , it follows that 
Zi <8 Zi > Z 2 ( 8 ) Z 2 and thus 


Zi (g) Zi - Z2 ( 8 ) Z2 -f ( 8 ) > 0 


(87) 


(sinee R ^ ( 8 ) R ^ > 0) from whieh it follows that < 0. 
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